Code Prediction by Feeding Trees to Transformers

Author: Chandra Satish
Kim Seohyun
Tian Yuchi
Zhao Jinman
Publication date: 02/07/2020

We advance the state of the art in the accuracy of code prediction (next-token prediction) used in autocomplete systems. First, we report that the recently proposed Transformer architecture, even used out of the box, outperforms previous neural and non-neural systems for code prediction. We then show that by making the Transformer architecture aware of the syntactic structure of code, we further increase the margin by which a Transformer-based system outperforms previous systems. With this, it outperforms the accuracy of an RNN-based system (similar to Hellendoorn et al., 2018) by 18.3%, the Deep3 system (Raychev et al., 2016) by 14.1%, and an adaptation of Code2Seq (Alon et al., 2018) for code prediction by 14.4%. We present in the paper several ways of communicating the code structure to the Transformer, which is fundamentally built for processing sequence data. We provide a comprehensive experimental evaluation of our proposal, along with alternative design choices, on a standard Python dataset, as well as on a Facebook internal Python corpus. Our code and data preparation pipeline will be available in open source.
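
The abstract does not spell out how code structure is communicated to the Transformer. As an illustration only, and not necessarily the authors' method, a syntax tree can be flattened into a sequence by a pre-order traversal so that a sequence model can consume it. The sketch below uses Python's standard `ast` module; the function name `preorder_tokens` and the choice of which node values to emit are assumptions made for this example.

```python
import ast
from typing import List


def preorder_tokens(source: str) -> List[str]:
    """Flatten the AST of `source` into a pre-order list of node labels.

    Hypothetical helper: emits each node's type name, followed by the
    identifier or literal it carries (if any), before visiting children.
    """
    tokens: List[str] = []

    def visit(node: ast.AST) -> None:
        # Pre-order: record the parent before any of its children.
        tokens.append(type(node).__name__)
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


if __name__ == "__main__":
    # Each parent label appears before the labels of its subtree.
    print(preorder_tokens("x = y + 1"))
```

Because every parent is emitted before its children, the resulting sequence keeps some structural information that a plain token stream would lose, which is the general idea the abstract points to.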

arXiv.org e-Print Archive

Detect and Repair Errors for DNN-based Software

Author: Tian Yuchi
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2021

Software based on deep neural networks (DNNs) is now widely applied in many areas, including safety-critical ones such as traffic control, medical diagnosis, and malware detection. However, the software engineering techniques that are supposed to guarantee functionality, safety, and fairness are not well studied. For example, serious crashes of DNN-based autonomous cars have been reported. These crashes could have been avoided if the DNN-based software had been well tested. Traditional software testing, debugging, and repair techniques do not work well on DNN-based software because deep neural networks have no control flow, data flow, or abstract syntax tree (AST). Software engineering techniques targeted at DNN-based software are therefore imperative. In this thesis, we first introduced the development of the software engineering (SE) for artificial intelligence (AI) area and how our work has influenced its advancement. We then summarized related work and some important concepts in SE for AI. Finally, we discussed four of our projects. Our first project, DeepTest, is one of the first papers to propose systematic software testing techniques for DNN-based software. We proposed neuron-coverage-guided image synthesis techniques for DNN-based autonomous cars and leveraged domain-specific metamorphic relations to generate oracles for newly generated test cases, so that DNN-based software can be tested automatically. We applied DeepTest to three top-performing self-driving car models from the Udacity self-driving car challenge, and the tool identified thousands of erroneous behaviors that could lead to fatal crashes. In the DeepTest project, we found that natural variations such as spatial transformations or rain and fog effects led to problematic corner cases for DNN-based self-driving cars. In the follow-up project, DeepRobust, we studied the per-point robustness of deep neural networks under natural variation.
We found that, for a given DNN model, some specific weak data points are more likely than others to cause erroneous outputs under natural variation. We proposed a white-box approach and a black-box approach to identify these weak data points. We implemented and evaluated our approaches on 9 DNN-based image classifiers and 3 DNN-based self-driving car models. Our approaches detect weak points with good precision and recall for both DNN-based image classifiers and self-driving cars. Most existing work in SE for AI, including our DeepTest and DeepRobust, focuses on instance-wise errors, which are single inputs that cause a DNN model to produce erroneous outputs. In contrast, group-level errors reflect a DNN model's weak performance in differentiating among certain classes or its inconsistent performance across classes. This type of error is very concerning because it has been linked to many notorious real-world errors that involved no malicious attacker. In our third project, DeepInspect, we first introduced group-level errors for DNN-based software and categorized them into confusion errors and bias errors based on real-world reports. We then proposed a neuron-coverage-based distance metric to detect group-level errors for DNN-based software without requiring labels. We applied DeepInspect to 8 pretrained DNN models trained on 6 popular image classification datasets, including three adversarially trained models. We showed that DeepInspect detects group-level violations for both single-label and multi-label classification models with high precision. As a follow-up and more challenging research project, we proposed five weighted regularization (WR) techniques to repair group-level errors for DNN-based software. These five techniques operate at different stages of DNN retraining or inference: the input phase, layer phase, loss phase, and output phase.
We compared and evaluated these five WR techniques in both single-label and multi-label classification, covering five combinations of four DNN architectures on four datasets. We showed that WR can effectively fix confusion and bias errors, and that each method has its own pros, cons, and applicable scenarios. The four projects discussed in this thesis address important problems in ensuring the functionality, safety, and fairness of DNN-based software and have had a significant influence on the advancement of SE for AI.
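
The abstract relies on neuron coverage without defining it. A common definition in the DNN testing literature, assumed here rather than taken from the thesis, is the fraction of neurons whose activation exceeds a threshold on at least one test input. The sketch below works on plain nested lists; the function name `neuron_coverage`, the input layout, and the default threshold of 0.5 are all hypothetical choices for illustration.

```python
from typing import Sequence


def neuron_coverage(
    activations: Sequence[Sequence[float]], threshold: float = 0.5
) -> float:
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations[i][j]` is assumed to hold the output of neuron j on
    test input i (a hypothetical layout chosen for this example).
    """
    if not activations or len(activations[0]) == 0:
        return 0.0
    num_neurons = len(activations[0])
    covered = [False] * num_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                covered[j] = True  # neuron j fired for some input
    return sum(covered) / num_neurons


if __name__ == "__main__":
    # Two test inputs, three neurons: neurons 0 and 1 fire, neuron 2 never does.
    sample = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1]]
    print(neuron_coverage(sample))
```

Under this definition, a test generator that is guided by coverage would prefer new inputs that activate neurons no earlier input has activated.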

Columbia University Academic Commons

DeepSearch: A Simple and Effective Blackbox Attack for Deep Neural Networks

Although deep neural networks have been very successful in image-classification tasks, they are prone to adversarial attacks. A wide variety of techniques, including blackbox and whitebox attacks, have emerged for generating adversarial inputs. In this paper, we present DeepSearch, a novel fuzzing-based, query-efficient, blackbox attack for image classifiers. Despite its simplicity, DeepSearch is shown to be more effective in finding adversarial inputs than state-of-the-art blackbox approaches. DeepSearch is additionally able to generate the most subtle adversarial inputs in comparison to these approaches.

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Long-term ex vivo monitoring of in vivo microRNA activity in liver using a secreted luciferase sensor

Author: A. T. Nguyen
A. Válóczi
B. A. Tannous
B. D. Brown
B. Zhang
C. L. Jopling
E. Chung
F. Liu
Gang Wang
H. J. Kim
J. Chen
J. H. Mansfield
J. Y. Lee
JianYang Hu
Jie Yuchi
L. J. Wolff
M. Al-Dosari
M. Lagos-Quintana
P. Mestdagh
R. Cawood
R. W. Carthew
S. Barth
T. Suzuki
T. Wurdinger
V. Ambros
V. N. Kim
WenHong Tian
XiaoBing Wu
XiaoYan Dong
Y. Wang
Yue Wang
Publication venue: Springer Science and Business Media LLC

Crossref

Transcriptome analysis of pancreatic cells across distant species highlights novel important regulator genes

Author: A Mavropoulos
A Necsulea
A Rossi
A Villasenor
A-C Binot
AC Nica
AE Adriaenssens
AJ Davidson
AJ Vilella
Alice Bernard
AM Ackermann
AP Ghaye
Arnaud Lavergne
B Blum
B Wendik
Bernard Peers
BR Tennant
C Benner
C Cortijo
C Dorrell
C Lillesaar
C Thisse
C Trapnell
CM Benitez
D Brawand
D Ramsköld
David Bergemann
DC McIntyre
DL Eizirik
DM Blodgett
DW Huang
E Hummler
E Zecchin
EK Lee
Estefania Tarifeño-Saldivia
F Thorel
FC Pan
FW Pagliuca
G Bertrand
G Gu
G Kilic
G Tian
GA Martens
GK Varshney
GM Ku
H Brereton
HS Spijker
I Moran
I Ruiz de Azua
Isabelle Manfroid
J Li
JE Gunton
Keerthana Padamata
KM Kwan
L Fagerberg
L Godinho
L Ye
L-EE Jao
LC Flasse
LC Murtaugh
M Baron
M Courtney
M Rebeiz
MA Hale
Marianne L. Voz
MD Kinkel
MI Love
MJ Muraro
MR DiGruccio
N Devos
N Inagaki
N Pishesha
NC Bramswig
PE Squires
RA Kimmel
RE Jennings
RJ Kinsella
RM Cripps
S Anders
S Chera
S Mergler
S Wang
SB Nelson
SE Flanagan
SP Moss
SR Holmstrom
T Gao
T Shay
VA Moran
W Zhang
Y Ohta
Y Xin
Y Yuchi
YJ Wang
Z Jiang
Z Li
Å Segerstolpe
Publication venue: Springer Science and Business Media LLC

Crossref

Bankruptcy effect on business competitors: Empirical study of US companies

Author: Nassimbwa Justine
Tian Yuchi
Publication venue: Umeå universitet, Företagsekonomi
Publication date: 01/01/2013

Bankruptcy is a negative event that affects not only the company in question but all stakeholders in society. Our research focuses on one stakeholder group: business competitors. How are competitors affected by bankruptcy announcements? Past research has tried to answer this question in different ways. Some studies compared two industries with different characteristics, while others worked with multiple industries. Past researchers suggested and tested three independent variables that they thought influence the returns of competitors in the face of bankruptcy: leverage, size, and industry concentration. We adopt a different perspective by focusing on competitors that are close to the bankrupt firm (business competitors) rather than on all competitors in an industry. The purpose of our research is to investigate whether a Chapter 11 bankruptcy announcement influences business competitors within the same economic sector during the period 2004-2012. To explore this topic, we incorporate three independent variables (economic sector concentration, firm leverage, and firm size) to study whether different characteristics of economic sectors and firms affect the bankruptcy announcement effect. Following a deductive approach and a quantitative method, we used secondary data to study the relationships between the three independent variables and the bankruptcy announcement effect on competitors. The results showed weak correlations between the three variables and the bankruptcy announcement effect; concentration was the most influential variable and size had the weakest effect. For both concentration and firm size, we found inverse relationships between the variable and the abnormal returns of the business competitors.
The abnormal returns earned by highly leveraged firms were lower than those earned by firms with low leverage. We conclude that a Chapter 11 bankruptcy announcement does influence the stock returns of business competitors. Firms in highly concentrated economic sectors experienced a contagion effect, while firms in less concentrated sectors experienced a competitive effect. The same conclusion was drawn for firm size. For leverage, no conclusion could be drawn regarding the contagion or competitive effect because the results were inconclusive.
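
The abstract refers to abnormal returns without saying how they were computed. In event studies, an abnormal return is the actual return minus an expected return; one widely used choice, assumed here and not confirmed by the abstract, is the market model, where the expected return is alpha + beta × market return. The sketch below is illustrative only: the function names, the parameter values, and the sample returns are hypothetical.

```python
from typing import List, Sequence


def abnormal_returns(
    stock: Sequence[float], market: Sequence[float], alpha: float, beta: float
) -> List[float]:
    """Per-day abnormal returns under an assumed market model.

    AR_t = R_t - (alpha + beta * Rm_t), where alpha and beta would
    normally be estimated on a window before the event.
    """
    if len(stock) != len(market):
        raise ValueError("stock and market series must have equal length")
    return [r - (alpha + beta * m) for r, m in zip(stock, market)]


def cumulative_abnormal_return(
    stock: Sequence[float], market: Sequence[float], alpha: float, beta: float
) -> float:
    """Sum of abnormal returns over the event window."""
    return sum(abnormal_returns(stock, market, alpha, beta))


if __name__ == "__main__":
    # Hypothetical competitor returns around an announcement date.
    competitor = [0.02, -0.01, -0.03]
    index = [0.01, 0.00, -0.01]
    print(cumulative_abnormal_return(competitor, index, alpha=0.0, beta=1.0))
```

A negative cumulative abnormal return for competitors would be read as a contagion effect, and a positive one as a competitive effect, in the sense used in the abstract.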

Publikationer från Umeå universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line